Universal Knowledge-Seeking Agents for Stochastic Environments
Abstract
We define an optimal Bayesian knowledge-seeking agent, KL-KSA, designed for countable hypothesis classes of stochastic environments, whose goal is to gather as much information about the unknown world as possible. Although this agent works for arbitrary countable classes and priors, we focus on the especially interesting case where all stochastic computable environments are considered and the prior is based on Solomonoff’s universal prior. Among other properties, we show that KL-KSA learns the true environment in the sense that it learns to predict the consequences of actions it does not take. We also show that it does not treat noise as information and that it avoids actions leading to inescapable traps. Finally, we present a variety of toy experiments demonstrating that KL-KSA behaves as expected.
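The claim that noise is not treated as information can be illustrated with a small sketch. It is not the paper's definition of KL-KSA: it assumes a finite hypothesis class, a single-step horizon, and a hypothetical helper named `expected_information_gain`, and it scores an action by the expected KL divergence between each hypothesis's predictive distribution and the Bayesian mixture (which equals the expected KL divergence from prior to posterior).

```python
import math


def expected_information_gain(prior, likelihoods):
    """One-step expected information gain, in bits, for a single action.

    prior:       list of prior weights w(nu), one per hypothesis (sums to 1).
    likelihoods: list of dicts mapping observation -> probability under
                 each hypothesis for that action.
    Hypothetical helper for illustration only.
    """
    # Collect every observation any hypothesis considers possible.
    observations = set()
    for dist in likelihoods:
        observations.update(dist)

    # Bayesian mixture: xi(o) = sum_nu w(nu) * nu(o).
    mixture = {
        o: sum(w * dist.get(o, 0.0) for w, dist in zip(prior, likelihoods))
        for o in observations
    }

    # sum_nu w(nu) * KL(nu || xi), skipping zero-probability terms.
    gain = 0.0
    for w, dist in zip(prior, likelihoods):
        for o, p in dist.items():
            if w > 0 and p > 0:
                gain += w * p * math.log2(p / mixture[o])
    return gain


prior = [0.5, 0.5]

# Both hypotheses agree that this action yields a fair coin flip: pure noise.
noisy_action = [{"heads": 0.5, "tails": 0.5}, {"heads": 0.5, "tails": 0.5}]

# The hypotheses disagree completely about this action's outcome.
informative_action = [{"x": 1.0}, {"y": 1.0}]

print(expected_information_gain(prior, noisy_action))
print(expected_information_gain(prior, informative_action))
```

Under these assumptions the coin-flip action scores zero, because every hypothesis predicts the same distribution and no observation can shift the posterior, while the action on which the hypotheses disagree scores one bit.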
Related resources
AIXIjs: A Software Demo for General Reinforcement Learning
Reinforcement learning (RL; Sutton and Barto, 1998; Bertsekas and Tsitsiklis, 1995) is a general and powerful framework with which to study and implement artificial intelligence (AI; Russell and Norvig, 2010). Recent advances in deep learning (Schmidhuber, 2015) have enabled RL algorithms to achieve impressive performance in restricted domains such as playing Atari video games (Mnih et al., 201...
Travel Agency Awareness of the Health Risks of International Travel; A Pilot Study
Introduction: Travel agencies may be consulted by intending travelers seeking pre-travel health advice. Travel agents should be equipped to deal with such queries and have access to a source of high quality travel health advice. This study aimed to establish the level of knowledge of travel health risks among Irish travel agencies. Methods: A web-based su...
Stochastic Nash Equilibrium Seeking for Games with General Nonlinear Payoffs
We introduce a multi-input stochastic extremum seeking algorithm to solve the problem of seeking Nash equilibria for a noncooperative game whose N players seek to maximize their individual payoff functions. The payoff functions are general (not necessarily quadratic), and their forms are not known to the players. Our algorithm is a nonmodel-based approach for asymptotic attainment of the Nash e...
OBDD-Based Optimistic and Strong Cyclic Adversarial Planning
Recently, universal planning has become feasible through the use of efficient symbolic methods for plan generation and representation based on reduced ordered binary decision diagrams (OBDDs). In this paper, we address adversarial universal planning for multi-agent domains in which a set of uncontrollable agents may be adversarial to us. We present two new OBDD-based universal planning algorith...
A Model-Based Goal-Directed Bayesian Framework for Imitation Learning in Humans and Machines
Imitation offers a powerful mechanism for knowledge acquisition, particularly for intelligent agents (like infants) that lack the ability to transfer knowledge using language. Several algorithms and models have recently been proposed for imitation learning in humans and robots. However, few proposals offer a framework for imitation learning in noisy stochastic environments where the imitator mu...